Skip to content

add padding option to base64_encode#13698

Closed
remicollet wants to merge 1 commit into
php:masterfrom
remicollet:base64_no_padding
Closed

add padding option to base64_encode#13698
remicollet wants to merge 1 commit into
php:masterfrom
remicollet:base64_no_padding

Conversation

@remicollet

@remicollet remicollet commented Mar 13, 2024

Copy link
Copy Markdown
Member

Make padding optional using

function base64_encode(string $string, bool $padding = true): string {}

Also add

PHPAPI extern zend_string *php_base64_encode_ex(const unsigned char *, size_t, zend_long flags);

php_base64_encode is unchanged

Encoding without padding is needed for ARGON2 password, see PR #13635
this will allow to simplify code in ext/openssl

@francislavoie

francislavoie commented Mar 13, 2024

Copy link
Copy Markdown

I think base64url format would also be nice to have (using -_ instead of +/ and without padding characters). Very common need, I add something like this in most projects:

if (!function_exists('base64url_encode')) {
    /**
     * Base64 encode with only URL-safe characters
     *
     * @param string $data
     * @return string
     */
    function base64url_encode(string $data)
    {
        return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
    }
}

if (!function_exists('base64url_decode')) {
    /**
     * Bas64 decode with URL-safe characters
     *
     * @param string $data
     * @return string|false
     */
    function base64url_decode(string $data)
    {
        return base64_decode(
            str_pad(
                strtr($data, '-_', '+/'),
                4 - ((strlen($data) % 4) ?: 4),
                '=',
                STR_PAD_RIGHT
            )
        );
    }
}

This alternate encoding is covered by RFC 4648: https://datatracker.ietf.org/doc/html/rfc4648#section-5

@remicollet remicollet changed the title add PHP_BASE64_NO_PADDING option to base64_encode add padding option to base64_encode Mar 14, 2024
@remicollet

remicollet commented Mar 14, 2024

Copy link
Copy Markdown
Member Author

I think base64url format would also be nice to have

Yes, another "URL SAFE" option is interesting, but out of the scope of this PR (and more complex because of multiple optimized variants of the implementation), and also require change to base64_decode.

Comment thread ext/standard/base64.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you used a flag for future proofing in case a new option would be needed in the future, right? Personally, I'd be fine with a bool value, but I can also understand your POV :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes (see discussion about URL SAFE mode) and to avoid changing function prototype again in so much places ;)

@kocsismate kocsismate left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! It would be nice if other people could also have a look at it.

@devnexen

Copy link
Copy Markdown
Member

LGTM. seeing the argument about the flag part, I m sold too :)

@remicollet

Copy link
Copy Markdown
Member Author

If nobody cry, I plan to merge this later this week

@bukka bukka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably go for PHP_BASE64_PADDING (negated variant) but I'm fine with this approach.

@remicollet

Copy link
Copy Markdown
Member Author

rebased

@remicollet

Copy link
Copy Markdown
Member Author

Merged as b5446e4
NEWS in dd6e738

@remicollet remicollet closed this Mar 26, 2024
@remicollet remicollet deleted the base64_no_padding branch March 26, 2024 12:51
@TimWolla

Copy link
Copy Markdown
Member

If nobody cry, I plan to merge this later this week

5 hours between that comment and the merge doesn't really leave time “to cry”. I find it not obvious that a new $padding parameter to base64_encode is the correct choice and an email on the list to a similar effect has been left unanswered: https://externals.io/message/122630#122646 and the conclusion of the previous discussion I've linked in that thread was that an RFC might be necessary to discuss the evolution of the base64 functionality: https://externals.io/message/119243#119253

@remicollet

remicollet commented Mar 26, 2024

Copy link
Copy Markdown
Member Author

email on the list to a similar effect has been left unanswered:

The discussion was about safe_url
I already answer about this: this is ANOTHER feature
and I don't plan to work on it now, especially because of the 5 optimized versions of the functions (per CPU)

I only work on PADDING, because this was needed for another feature, see cfed1f7

@bukka

bukka commented Mar 26, 2024

Copy link
Copy Markdown
Member

If there's an objection, the user API change should be probably reverted and just kept the internal API change that can be still used for another feature.

@bukka

bukka commented Mar 26, 2024

Copy link
Copy Markdown
Member

I guess the objection was more about using a flag for this rather than extra bool parameter. Personally I think that extra bool parameter is better but if there's no clear agreement, it needs to reverted and it needs RFC. I mean just the user API change. Inernal API (_ex function) is fine.

@TimWolla

Copy link
Copy Markdown
Member

The discussion was about safe_url

From my understanding, Robert said that regular base64 without padding is not a thing and that is a relevant complaint about the userland API change.

I guess the objection was more about using a flag for this rather than extra bool parameter.

Another alternative would be an enum specifying the variant:

enum Base64Variant {
  case Standard;
  case Url;
  case UrlUnpadded;
}

Note how that enum would disallow combining the Unpadded variant with the Standard alphabet, which might or might not be relevant.

@bukka

bukka commented Mar 26, 2024

Copy link
Copy Markdown
Member

I wouldn't say that it's not a thing (we don't not require standards for everything) so as soon as there is a use case, it's a thing. And we have got an actual use in core for this. That said I agree that URL variant is more useful for users but don't think we should disallow standard unpadded version.

In any case there is clearly no agreement about this so I think we should revert the user part of this change (not introduce padding argument) as we need an RFC for this. @remicollet can you revert that bit please?

@remicollet

Copy link
Copy Markdown
Member Author

@remicollet can you revert that bit please?

Reverted in 6c5814d

Notice: I don't plan to care care of further changes or RFC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants